Abstract

In this correspondence, information-theoretic tools are used
to investigate the statistical properties of modeled cochlear
nucleus globular bushy cell spike trains. The firing patterns are
obtained from simulation software that generates sample spike
trains from any auditory input. Here we analyze for the first
time the responses of globular bushy cells to voiced and un- 
voiced speech sounds. Classical entropy estimates, such as 
the direct method, are improved upon by considering a time- 
varying and time-dependent entropy estimate. With this method 
we investigated the relationship between the predictability of 
the neuronal response and the frequency content in the auditory 
signals. The analysis quantifies the temporal precision of the 
neuronal coding and the memory in the neuronal response. 

1. Introduction 

The auditory system is an ideal model for studying the coding and
processing of information in the neuronal system. The fidelity
of coding is remarkable: the human ear covers a dynamic
range larger than 120 dB and a frequency range from 16 Hz
to 16 kHz, and it provides temporal precision on the order of tens of
microseconds, which allows us to localize sound in the horizontal
plane with exquisite resolution. It is clear that these features
require delicate preprocessing in the inner ear. The key features
are narrow-band filtering and nonlinear dynamic compression, 
which both depend on an active mechanical feedback amplification
process. Every location in the inner ear has its characteristic
frequency (CF), where a receptor cell transduces mechanical
vibrations into an electrical potential, which is then transferred
at the chemical synapse into all-or-none nerve action potentials
of the auditory nerve. A single receptor cell is innervated
by multiple (up to 40) auditory nerve fibers (ANFs).
ANFs are classified by their spontaneous rates into three groups:
low-, medium- and high-spontaneous-rate fibers. They have
different thresholds and cover different dynamic ranges. Neu- 
ronal processing starts in the first station of the central nervous 
system, the cochlear nucleus (CN). It consists of several neu- 
ron types that receive direct inputs from ANFs and show var- 
ious firing properties. Here we focus on globular bushy cells 
(GBCs), which are among the principal cells of the CN and are known for
their precise temporal coding (see [3] for an overview). They
are involved in the sound localization pathway and are among 
the fastest neurons in our brain. They lock on low-frequency 
tones and amplitude-modulated signals like speech. GBCs receive
input from multiple ANFs through giant synapses (endbulbs
of Held). They suppress spontaneous activity and enhance
temporal precision by coincidence detection. Compared
to ANFs, their responses are much more reliable, which makes 
them ideal candidates for information calculations. Information 
theory provides important insight on the coding performed by 
the auditory system and is a theoretical tool commonly used to 
interpret neuronal data. 

For example, entropy is used as a measure of information in a 
spike train [7]. Entropy specifies the number of bits required to
represent a sequence of outcomes of a random source. In other 
words, it captures the richness of the random output: sources 
with low entropy produce predictable outcomes, while sources 
with high entropy are harder to predict. When evaluating the 
entropy of a spike train, we are evaluating the predictability of 
this random sequence over time. This is an indicator of the level 
of activity of the neuron, but from this measure one cannot in- 
fer much about the origin of this variability or the "information 
content" of the spike train. 

Estimating entropy is, in general, a complex task that may
require a large amount of data, and one should not embark on such
an endeavor without good motivation.

Some of the desirable properties of entropy as a measure of
predictability of random variables are the following:

Entropy does not change under injective transformations: Any
transformation of the data that can be inverted preserves
entropy.

Entropy is bounded: The entropy of discrete-alphabet sequences
is bounded between a minimum and a maximum, which
provides an absolute measure of predictability.

Entropy can be factorized: When computing the entropy of
multiple random variables, one can divide the calculation
into sub-tasks.

The entropy has often been estimated
using the direct method [7], which actually yields an upper
bound on the entropy and is also a time-invariant
measure. This estimate is accurate if the neuronal activity is 
modeled as coming from a random, memoryless source that has 
no particular relationship with a known input process. This is 
clearly not the case in many scenarios in which one can obtain 
a better estimate of the entropy by considering the time depen- 
dence and the time variability of the neuronal activity. By con- 
sidering these two refinements to the direct method we improve 
the quality of the entropy estimate. In Sec. 2 we introduce the
model we use to produce neuronal firings and the auditory
inputs we consider. In Sec. 3 we introduce fundamental properties
of the entropy. An improvement of the direct method to
estimate the entropy is provided in Sec. 4, and in Sec. 5 we
analyze the utterance of a vowel and a consonant by using the
entropy estimate from Sec. 4. Finally, Sec. 6 concludes the
paper.

2. Model Description 

We apply an inner ear model from [8], where we have adapted
the middle ear to tune the model to achieve human-like hearing
thresholds. The inner ear model generates random spike
trains of ANFs, which excite GBCs. Our GBC model is realized
as a single-compartment model with Hodgkin-Huxley-like
ion channels (HPAC, Kht, Klt) described in [4]. We tuned the innervation
(32 high-, 4 medium- and 4 low-spontaneous-rate fibers)
and the synaptic weights of our model to replicate low spontaneous
activity and high synchronization and entrainment values (see [5])
according to physiological recordings from [2]. This model was
also able to reproduce experimental results obtained from pure-tone
stimulation: PSTH, ISIH and receptive field maps. The
model produces spike trains in response to auditory inputs with a time
resolution of about 20.8 µs, which were re-sampled with 1 ms
precision in this study.
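
The re-sampling step can be illustrated with a minimal Python sketch. It assumes that the model output is available as a list of spike times in seconds and that a 1 ms bin is set to 1 whenever it contains at least one spike. The function name bin_spike_times and this binarization rule are assumptions made for illustration; the text does not specify how the re-sampling was implemented.

```python
def bin_spike_times(spike_times_s, duration_s, bin_s=1e-3):
    """Convert spike times (in seconds) into a binary sequence of fixed-width bins."""
    n_bins = int(round(duration_s / bin_s))
    train = [0] * n_bins
    for t in spike_times_s:
        if t < 0:
            continue                  # ignore spikes before the start of the recording
        idx = int(t / bin_s)          # index of the bin that contains the spike
        if idx < n_bins:
            train[idx] = 1            # assumption: a bin is 1 if it holds any spike
    return train


# Hypothetical spike times on the fine time grid of the model.
example_times = [0.00020, 0.00310, 0.00352, 0.00790]
example_train = bin_spike_times(example_times, duration_s=0.010)
print(example_train)  # [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
```

Two spikes that fall into the same 1 ms bin (here at 3.10 ms and 3.52 ms) collapse into a single 1, which is the price paid for the coarser time grid.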

In the following we analyze, as examples, the neuronal firing
produced by the model at a single CF for two utterances from the
Isolated Letter Speech Recognition (ISOLET) database [1]:

• /ay/, male speaker, CF = 1.5 kHz (data set fcmc0-A1),

• /es/, female speaker, CF = 6.15 kHz (data set fmb0-S1).

Since the data is produced by simulations, we can utilize a large 
number of samples, -10 s , in our analysis. 

Firing patterns (neurograms) of GBCs with CFs covering a
part of the hearing range, as produced by the model, are plotted in
Fig. 1. The most striking difference between a neurogram and a
spectrogram is the distinct temporal structure of the neurogram.
During the voiced parts of the speech sound (/ay/ and /e/), the
firing patterns are regular and lock
to the pitch frequency of the speaker and its higher harmonics
with high fidelity. During the fricative /s/, which has noise-like
components in the frequency range above about 2 kHz, the
firing pattern is irregular.

Figure 1: Firing patterns of 40 globular bushy cells per characteristic
frequency for /ay/ (top panel) and /es/ (bottom panel).



3. Entropy 

Entropy [6] is a measure of the uncertainty associated with a
random variable (RV) $X$:

$$H(X) = -\sum_{x \in \mathcal{X}} P[X = x] \log_2 P[X = x] \tag{1}$$

where $\log_2$ is the logarithm to the base two and $\mathcal{X}$ is the support
of the RV $X$, i.e. the set of possible values of $X$.
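
As a small numerical illustration of the definition in (1), the following Python sketch evaluates the entropy of three binary sources. The function name entropy_bits and the example distributions are chosen here for illustration only.

```python
import math


def entropy_bits(probabilities):
    """Entropy in bits of a discrete distribution, following Eq. (1)."""
    # Outcomes with zero probability are skipped (0 * log 0 is taken as 0).
    return sum(-p * math.log2(p) for p in probabilities if p > 0)


fair_coin = entropy_bits([0.5, 0.5])     # hardest binary source to predict
biased_coin = entropy_bits([0.9, 0.1])   # mostly predictable binary source
constant = entropy_bits([1.0])           # fully predictable source
print(fair_coin, biased_coin, constant)
```

The fair coin attains 1 bit, the constant source 0 bits, and the biased coin lies in between (about 0.47 bits), in line with the reading of entropy as a measure of predictability.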

In [6], Shannon defined entropy as a suitable measure of the
uncertainty of a random source by following an axiomatic approach,
that is, by requiring a series of desired properties of the
measure:

Continuous in the probability distribution: Small variations
in the probability distribution of X correspond to small
variations in H(X).

The uniform distribution maximizes entropy: This is the
case where there is the most uncertainty about the random
outcome.

It should factorize: If a choice is broken down into two successive
choices, the original H should be the weighted
sum of the individual values of H.

Shannon argued that the definition of entropy in (1), apart
from scaling, is the only definition consistent with these axioms.
Other important properties make the definition of entropy in (1)
a particularly suitable measure of random predictability. In particular:

Data processing inequality: $H(X) \ge H(f(X))$, where $f$
is any function. That is, entropy cannot be increased by any processing
of the original RV. Entropy is preserved by injective
transformations, in which case the above inequality
holds with equality.

Boundedness: $0 \le H(X) \le \log_2 |\mathcal{X}|$. The value of entropy
is always bounded between the entropy of a degenerate
distribution (a constant), which is zero, and that of the uniform
distribution over the support, $\log_2 |\mathcal{X}|$.
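
Both properties can be checked numerically on a toy example. The sketch below is illustrative only: the helper pushforward, the uniform four-symbol source and the two maps are assumptions chosen to make the effect visible.

```python
import math
from collections import defaultdict


def entropy_bits(pmf):
    """Entropy in bits of a distribution given as a dict value -> probability."""
    return sum(-p * math.log2(p) for p in pmf.values() if p > 0)


def pushforward(pmf, f):
    """Distribution of f(X) when X is distributed according to pmf."""
    out = defaultdict(float)
    for x, p in pmf.items():
        out[f(x)] += p
    return dict(out)


pmf_x = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}   # uniform over four symbols
h_x = entropy_bits(pmf_x)
h_injective = entropy_bits(pushforward(pmf_x, lambda x: 2 * x + 1))  # invertible map
h_lossy = entropy_bits(pushforward(pmf_x, lambda x: x % 2))          # merges symbols
print(h_x, h_injective, h_lossy)  # 2.0 2.0 1.0
```

The uniform source reaches the upper bound of 2 bits for four symbols, the invertible map leaves the entropy unchanged, and the map that merges symbols loses one bit.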

It is also possible to define a conditional analogue of the
entropy in (1):

$$H(X|Y) = \sum_{y} P[Y = y] H(X|Y = y) \tag{2}$$

An important property of $H(X|Y)$ is the following:

Conditioning reduces entropy: $H(X|Y) \le H(X)$, where
equality holds if and only if $Y$ is independent of $X$.

With the definitions (1) and (2), one can express the entropy
of $N$ RVs $\{X_1 \ldots X_N\}$ as a function of the conditional entropies
of the RVs $X_i$:

$$H(X_1 \ldots X_N) = \sum_{i=1}^{N} H(X_i | X_1 \ldots X_{i-1}) \tag{3}$$
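
For two RVs, the chain rule and the effect of conditioning can be verified on a small joint distribution. The following Python sketch is an illustration; the joint probabilities and helper names are hypothetical and not taken from the data analyzed in this paper.

```python
import math
from collections import defaultdict


def entropy_bits(pmf):
    """Entropy in bits of a distribution given as a dict value -> probability."""
    return sum(-p * math.log2(p) for p in pmf.values() if p > 0)


def marginal(joint_pmf, axis):
    """Marginal distribution of one coordinate of a joint distribution."""
    out = defaultdict(float)
    for pair, p in joint_pmf.items():
        out[pair[axis]] += p
    return dict(out)


def conditional_entropy_x2_given_x1(joint_pmf):
    """H(X2 | X1) computed as the weighted average of conditional entropies."""
    total = 0.0
    for a, p_a in marginal(joint_pmf, 0).items():
        cond = {b: p / p_a for (a2, b), p in joint_pmf.items() if a2 == a}
        total += p_a * entropy_bits(cond)
    return total


# Hypothetical joint distribution P[X1 = a, X2 = b] of two dependent binary RVs.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

h_joint = entropy_bits(joint)
h_x1 = entropy_bits(marginal(joint, 0))
h_x2 = entropy_bits(marginal(joint, 1))
h_x2_given_x1 = conditional_entropy_x2_given_x1(joint)
print(h_joint, h_x1 + h_x2_given_x1, h_x2)
```

The joint entropy equals the sum of the first entropy and the conditional entropy of the second RV, and the conditional entropy is strictly smaller than the marginal entropy because the two RVs are dependent.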

4. A Time-Varying and Time-Dependent Entropy Estimate

Strong et al. [7] proposed a method of evaluating the entropy by
considering sliding windows of size $T$. We take here
the same approach and consider the problem of estimating
$H(W_1 \ldots W_N)$, the entropy of a sequence of $N$ sliding windows
$W_i$ of size $T$. By applying (3) we obtain

$$H(W_1 \ldots W_N) = \sum_{i=1}^{N} H(W_i | W_1 \ldots W_{i-1}) \tag{4}$$

From the expression in (4) we now see the difficulty in estimating
the entropy of the whole firing sequence. A precise
estimate would require us to consider the conditional entropy of
the current word given all the past words. This is in general a
very complicated task, as it requires considering the joint distribution
of one word and all the past words. Note that we
cannot claim independence among words and drop the conditioning,
because the word at time $i$ shares common bins with all
the $i - T$ previous words. We can assume that the system has a
finite memory, that is, the past dependence is limited to a certain
set of windows, that is

$$H(W_i | W_1 \ldots W_{i-1}) = H(W_i | W_{i-M} \ldots W_{i-1}) \tag{5}$$

for some $M$. One could apply the "conditioning reduces entropy"
property of the conditional entropy to obtain an upper
bound on the actual entropy estimate, that is

$$H(W_i | W_1 \ldots W_{i-1}) \le H(W_i) \quad \Longrightarrow \tag{6a}$$

$$H(W_1 \ldots W_N) \le \sum_{i=1}^{N} H(W_i) \tag{6b}$$

This latter bound on the entropy estimate justifies the direct
method of [7] as being an upper bound to the entropy estimate.
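
The bound in (6b) can be illustrated on synthetic data. The Python sketch below is not the estimator used in this paper: it assumes plug-in (empirical-frequency) entropy estimates across simulated trials, windows advanced one bin at a time, and independent random bins as a stand-in for model output. The names plugin_entropy_bits and sliding_words are hypothetical.

```python
import math
import random
from collections import Counter


def plugin_entropy_bits(samples):
    """Plug-in (empirical-frequency) entropy estimate in bits."""
    n = len(samples)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(samples).values())


def sliding_words(train, T):
    """All length-T windows of a binary train, advanced one bin at a time."""
    return [tuple(train[i:i + T]) for i in range(len(train) - T + 1)]


rng = random.Random(0)
# Hypothetical data: 2000 simulated trials of an 8-bin binary spike train.
trials = [[int(rng.random() < 0.3) for _ in range(8)] for _ in range(2000)]
words = [sliding_words(trial, T=3) for trial in trials]  # words[k][i] is W_i in trial k

# Left side of (6b): entropy of the whole word sequence, estimated across trials.
h_joint = plugin_entropy_bits([tuple(w) for w in words])
# Right side of (6b): sum of the per-position word entropies, conditioning dropped.
h_sum = sum(plugin_entropy_bits([w[i] for w in words]) for i in range(len(words[0])))
print(h_joint, h_sum)
```

Because consecutive words overlap in all but one bin, the sum of the unconditioned word entropies counts the same bins several times and lies well above the joint estimate, which cannot exceed 8 bits for an 8-bin train.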

4.1. A time-dependent entropy estimate

Given the loss in accuracy of the entropy estimate when ignoring
the time dependency among words, we propose a time-dependent
estimate of the entropy, which provides more accurate
estimates than the direct method. This also results in a
time-varying estimate of the entropy of the words, which provides
interesting insights into the relationship between the entropy of
the word $W_i$ and the auditory input at time $i$.
In particular:

• We evaluate the entropy of each word $W_i$ across samples
and obtain an estimate that we then correlate with the
sensory input. We show that this time-varying estimate
provides significant insight into how the neurons code the
sensory input.

• For each codeword $W_i$ we evaluate the conditional entropy
given the past values of the codeword. By noting
the variation of the entropy estimate as the length of the
conditioning increases, we investigate the time correlation
among firings and, implicitly, estimate the memory
of the random process.

• We use the time-varying and time-dependent estimate of
the entropy to obtain a better estimate of the entropy of
the overall neuronal firing.
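
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used for the results in this paper: entropies are plug-in estimates across trials, the conditional entropy is obtained as a difference of two joint entropies via the chain rule (3), and the spike trains come from a hypothetical toy generator with a one-bin refractory period. Plug-in estimates are biased when the number of trials is small relative to the number of possible words, so the numbers are only indicative.

```python
import math
import random
from collections import Counter


def plugin_entropy_bits(samples):
    """Plug-in (empirical-frequency) entropy estimate in bits."""
    n = len(samples)
    return sum(-(c / n) * math.log2(c / n) for c in Counter(samples).values())


def word(train, i, T):
    """Word W_i: the T consecutive bins starting at bin i."""
    return tuple(train[i:i + T])


def entropy_profiles(trials, T, M):
    """Estimates of H(W_i) and H(W_i | W_{i-M} ... W_{i-1}) for each i >= M."""
    n_words = len(trials[0]) - T + 1
    marginal, conditional = [], []
    for i in range(M, n_words):
        current = [word(tr, i, T) for tr in trials]
        past = [tuple(word(tr, j, T) for j in range(i - M, i)) for tr in trials]
        both = list(zip(past, current))
        marginal.append(plugin_entropy_bits(current))
        # H(W_i | past) = H(past, W_i) - H(past), from the chain rule.
        conditional.append(plugin_entropy_bits(both) - plugin_entropy_bits(past))
    return marginal, conditional


rng = random.Random(1)


def simulate_trial(n_bins, p_spike=0.4):
    """Toy spike train (assumption): no spike directly after a spike."""
    train, prev = [], 0
    for _ in range(n_bins):
        spike = 0 if prev else int(rng.random() < p_spike)
        train.append(spike)
        prev = spike
    return train


trials = [simulate_trial(20) for _ in range(3000)]
h_marg, h_cond = entropy_profiles(trials, T=4, M=3)
print(sum(h_marg), sum(h_cond))
```

At every position the conditional estimate is no larger than the unconditioned one, and summing the conditional profile gives a tighter estimate of the entropy of the whole sequence than summing the unconditioned word entropies.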

5. Numerical Results 

In this section we present a set of numerical evaluations of the
entropy estimate defined in Sec. 4. We begin by showing how
the entropy estimate correlates with the log-amplitude of the frequency
content at a given frequency. In Fig. 2 we plot the short-term
power spectrum at 1.5 kHz for the utterance /ay/ (top
panel) together with the average firing probability (mid panel)
and the entropy estimate (lower panel). For the entropy estimate
we consider a window of length 10 ms and we plot both the non-conditional
version (black) of (1) and the conditional version
(grey) of (2), for which we use a conditioning over the past 20
windows. From the plot it is clear that the entropy estimate correlates
with the frequency content both at low signal amplitudes, in
the intervals 0 to 0.1 s and 0.3 to 0.4 s, and at high signal amplitudes,
in the interval 0.1 to 0.23 s. The precise information
on the temporal evolution of the frequency content is not provided
by the firing probability, which is an instantaneous measure
of the neuronal activity that does not provide much information
about the predictability of the response over time. The conditional
entropy estimate performs better than the non-conditional
version, as it is smoother in the interval 0.1 to 0.2 s and better follows
the log-amplitude of the signal. This is so because, in this
interval, the probability of firing is very high and one obtains a
better estimate by considering the correlation between the value
of the current window and the past ones.

Figure 2: Log-amplitude, average firing probability (1 ms time
bins), non-conditional (black) and conditional (grey) entropy
estimate for /ay/ at 1.5 kHz. Analysis window length: 10 ms.

Fig. 3 is the analogue of Fig. 2 for the utterance /es/. As
in Fig. 2, one observes a very precise time correlation between
the entropy estimate and the log-amplitude of the frequency
content. For this case it is interesting to notice the behavior
of the estimate in the interval 0.3 to 0.4 s. Even though the amplitude
of the signal is large, the firing probability is very low;
despite this, the entropy estimate does not decay and preserves
a good similarity to the amplitude content. Another interesting
detail is that the entropy decreases during the vowel part due
to the regular firing pattern. For the noise-like fricative /s/, the
entropy rate is higher compared to the vowel part.

In Fig. 4 we study the impact of the window length on the
entropy estimate for /ay/ at 1.5 kHz. For small windows
the entropy estimate is very localized but suffers from large variations,
as the predictability of a few bits can greatly increase the
estimate. For longer windows one obtains a smoother estimate
but loses time resolution.

Figure 3: Log-amplitude, average firing probability and entropy
estimate for /es/ at 6.15 kHz.

Figure 4: The effect of the window length on the entropy estimate.
[Panels show sliding windows of different lengths, including 15 ms
and 50 ms; horizontal axis: Time (s).]

Finally we plot the cumulative difference between the entropy
estimate and the conditional entropy estimate for different
lengths of the conditioning. For this plot we again consider the
utterance /ay/ at 1.5 kHz and a window length of 10 ms. The
error between $H(W_i)$ and $H(W_i | W_{i-L} \ldots W_{i-1})$ as in (6b) is
plotted for $L = 4$, 9 and 20 ms. This difference indicates the refinement
of the entropy estimate when using a time-dependent
entropy estimate. The impact of the conditioning also indicates
the memory of the neuronal process. A decrease in the entropy
estimate for small values of $L$ is expected, as we are accounting
for the correlation of overlapping windows, that is, windows that
share the same bins. A decrease of the conditional entropy for
larger values of $L$, instead, indicates a dependency among windows
that do not share any bin and thus indicates a time dependency
in the process. This is indeed what we observe in Fig. 5:
the conditioning for $L = 20$ ms indicates that in this case
neuronal firings are correlated up to 20 ms.
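
The cumulative difference itself is a simple running sum. The sketch below assumes that "cumulative" refers to accumulation over the window index and that the two entropy profiles have already been estimated; the function name and the numerical values are hypothetical and serve only to show the computation.

```python
from itertools import accumulate


def cumulative_difference(h_marginal, h_conditional):
    """Running sum over time of H(W_i) - H(W_i | W_{i-L} ... W_{i-1})."""
    return list(accumulate(m - c for m, c in zip(h_marginal, h_conditional)))


# Hypothetical per-window entropy estimates in bits (illustrative values only).
h_marginal = [3.0, 2.5, 2.0, 2.5]
h_conditional = [1.0, 1.5, 1.5, 1.0]
curve = cumulative_difference(h_marginal, h_conditional)
print(curve)  # [2.0, 3.0, 3.5, 5.0]
```

Since conditioning cannot increase entropy, every increment is non-negative and the curve never decreases; a curve that keeps rising faster for a longer conditioning length points to dependencies that reach further into the past.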

Figure 5: Cumulative error between $H(W_i)$ and
$H(W_i | W_{i-L} \ldots W_{i-1})$.

6. Conclusions

In this correspondence we introduced a time-dependent and time-varying
entropy estimate as a measure of the predictability of
neuronal responses. We have for the first time analyzed the responses
of globular bushy cells in the cochlear nucleus to voiced
and unvoiced speech sounds. These neurons extract temporal
features of sound signals with high temporal fidelity, which is
reflected in the extraordinarily high entropy rates of their firing
patterns. Our method is particularly suitable for analyzing the
neuronal response, as it allows one to retain a high temporal resolution
and yet consider long time dependencies in the signal.
This could not be attained with previous estimation approaches,
which would have either a precise temporal resolution, using
short estimation windows, or account for time dependencies,
using long estimation windows.